CIMTDetect: A Community Infused Matrix-Tensor Coupled Factorization Based Method for Fake News Detection
Detecting whether a news article is fake or genuine is a crucial task in
today's digital world where it's easy to create and spread a misleading news
article. This is especially true of news stories shared on social media since
they do not undergo the stringent journalistic checking associated with
mainstream media. Given the inherent human tendency to share information with
social connections at a mouse-click, fake news articles masquerading as real
ones tend to spread widely and virally. The presence of echo chambers (people
sharing the same beliefs) in social networks only adds to the
widespread existence of fake news on social media. In this paper, we tackle
the problem of fake news detection from social media by exploiting the very
presence of echo chambers that exist within the social network of users to
obtain an efficient and informative latent representation of the news article.
By modeling the echo-chambers as closely-connected communities within the
social network, we represent a news article as a 3-mode tensor of the structure
<News, User, Community> and propose a tensor factorization based method to
encode the news article in a latent embedding space preserving the community
structure. We also propose an extension of the above method, which jointly
models the community and content information of the news article through a
coupled matrix-tensor factorization framework. We empirically demonstrate the
efficacy of our method for the task of Fake News Detection over two real-world
datasets. Further, we validate the generalization of the resulting embeddings
over two other auxiliary tasks, namely: 1) News Cohort Analysis and
2) Collaborative News Recommendation. Our proposed method outperforms
appropriate baselines for both tasks, establishing its generalization.
Comment: Presented at ASONAM'1
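
The tensor view described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the three tensor modes are news, user and community, it uses plain CP decomposition fitted by alternating least squares as a stand-in for the paper's community-infused factorization, and the names build_news_tensor, cp_als, shares and membership are hypothetical.

```python
import numpy as np


def build_news_tensor(shares, membership):
    """Build a news x user x community tensor (assumed layout).

    shares:     (n_news, n_users) matrix, 1 if the user shared the article.
    membership: (n_users, n_communities) matrix, 1 if the user belongs
                to the community.
    Entry [n, u, c] is shares[n, u] * membership[u, c].
    """
    return np.einsum("nu,uc->nuc", shares, membership)


def cp_als(T, rank, n_iter=100, seed=0):
    """Plain CP decomposition of a 3-mode tensor via alternating least squares.

    Returns factor matrices A (news), B (users), C (communities); the rows
    of A can serve as latent embeddings of the news articles.
    """
    rng = np.random.default_rng(seed)
    n_news, n_users, n_comms = T.shape
    A = rng.standard_normal((n_news, rank))
    B = rng.standard_normal((n_users, rank))
    C = rng.standard_normal((n_comms, rank))
    for _ in range(n_iter):
        # Each step solves a least-squares problem for one factor
        # while the other two are held fixed.
        A = np.einsum("ijk,jr,kr->ir", T, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", T, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", T, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C


# Toy data (made up for illustration): 2 articles, 3 users, 2 communities.
shares = np.array([[1, 0, 1],
                   [0, 1, 0]], dtype=float)
membership = np.array([[1, 0],
                       [0, 1],
                       [1, 0]], dtype=float)

T = build_news_tensor(shares, membership)
news_emb, user_emb, comm_emb = cp_als(T, rank=2)
```

In this sketch the rows of news_emb would be the article embeddings passed to a downstream classifier. The coupled variant mentioned in the abstract would additionally tie the news factor to a factorization of a news-by-content matrix, which is not shown here.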
A study of readability of texts in Bangla through machine learning approaches
In this work, we have investigated text readability in the Bangla language. Text readability is an indicator of the suitability of a given document with respect to a target reader group. Therefore, text readability has a huge impact on educational content preparation. Advances in the field of natural language processing have enabled the automatic identification of the reading difficulty of texts and contributed to the design and development of suitable educational materials. Although Bangla is one of the major languages in India and the official language of Bangladesh, research on text readability in Bangla is still in its nascent stage. In this paper, we have presented computational models to determine the readability of Bangla text documents based on syntactic properties. Since Bangla is a digital-resource-poor language, we were required to develop a novel dataset suitable for automatic identification of text properties. Our initial experiments have shown that existing English readability metrics are inapplicable to Bangla. Accordingly, we have proceeded towards new models for analyzing text readability in Bangla. We have considered language-specific syntactic features of Bangla text in this work. We have identified major structural contributors responsible for text comprehensibility and subsequently developed readability models for Bangla texts. We have used different machine learning methods such as regression, support vector machines (SVM) and support vector regression (SVR) to achieve our aim. The performance of the individual models has been compared against one another. We have conducted a detailed user survey for data preparation, identification of important structural parameters of texts and validation of our proposed models. The work has further implications in the field of educational research and in matching text to readers.